Prognostic factors and predictive models for patients with lung large cell neuroendocrine carcinoma: Based on SEER database

Abstract
Background: Lung large cell neuroendocrine carcinoma (LCNEC) is a rare, aggressive, high‐grade neuroendocrine carcinoma with a poor prognosis, mainly seen in elderly men. To date, we have found no studies on predictive models for LCNEC.
Methods: We extracted data on confirmed LCNEC cases from 2010 to 2018 from the Surveillance, Epidemiology, and End Results (SEER) database. Univariate and multivariate Cox proportional hazards regression analyses were used to identify independent risk factors. We then constructed a novel nomogram and assessed its predictive performance with receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA).
Results: A total of 2546 patients with LCNEC were identified. After excluding those diagnosed at autopsy or by death certificate and those with missing tumor, lymph node, metastasis (TNM) stage, tumor grade, or other data, 743 cases were included in the study. Univariate and multivariate analyses identified N stage, intrapulmonary metastasis, bone metastasis, brain metastasis, and surgical intervention as independent risk factors. The ROC curves, calibration curves, and DCA in the training and validation groups confirmed that the nomogram could accurately predict the prognosis.
Conclusions: The nomogram obtained from our study is expected to be a useful tool for personalized prognostic prediction in LCNEC patients, which may help in clinical decision‐making.


| INTRODUCTION
Lung large cell neuroendocrine carcinoma (LCNEC) is a rare, aggressive tumor with a poor prognosis. It is generally categorized in the non-small cell lung cancer (NSCLC) group because it is currently considered a neuroendocrine subtype of large cell lung cancer. 1 However, it also belongs to the group of lung neuroendocrine tumors; it is predominantly seen in older men and accounts for approximately 15% of all lung neuroendocrine tumors and 3% of all lung cancers. 2,3 The poor survival outcome of LCNEC is mainly due to local recurrence and distant metastases, and its five-year survival rates range from 15% to 57%. 4 A previous study showed that the total age-adjusted incidence of LCNEC during 2000-2013 was 0.3/100000 (0.4/100000 in men and 0.3/100000 in women) and that the incidence rose during this period, with 5-year lung cancer-specific survival (LCSS) and overall survival (OS) of 20.7% and 16.7%, respectively. 5
LCNEC has a poor prognosis and a low incidence, and it is usually disseminated. Because of its epidemiology and unique biology, there is an urgent need for multicenter, large-sample prognostic studies. The SEER database is an open tumor database in the U.S. that draws on multiple medical institutions and covers about 30% of the U.S. population, so it can provide large-sample data for tumor-related studies. Nomograms are now widely used to assess the prognosis of cancer patients because of their quantitative and intuitive nature. 6,7 Therefore, we used the SEER database to construct a novel nomogram that can provide personalized prognostic predictions and help guide clinical decisions.
The following variables were collected: (1) demographic variables, including age, gender, race, and marital status; and (2) clinical information, including tumor pathological grade, TNM stage, distant metastases (lung, liver, bone, brain), radiotherapy, chemotherapy, and surgical treatment.

| Patients data collection
In determining the sample size for multivariate Cox regression, we followed the 10 events per variable (EPV) rule, which is widely used in model-building studies: each variable eventually included in the regression model should have at least 10 positive events, and if positive events outnumber negative events, the negative events should also satisfy 10 EPV. 8 We estimated that about 9-11 variables would enter the final multivariate Cox regression. In our data, positive events outnumbered negative events, so under the 10 EPV principle the training group needed at least 90 to 110 negative events. With a random 7:3 split into a training group and a validation group, the training group contained about 126 negative events (more than 110), so the sample size for the model was sufficient. Finally, 743 patients were included in this study and were randomized in a 7:3 ratio into a training group (70%) and a validation group (30%). We constructed and internally validated the nomogram with the training group, followed by external validation with the validation group.
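The sample-size reasoning above can be sketched in a few lines of Python. This is only an illustration of the 10 EPV check, not the authors' code: the function names are hypothetical, and the positive-event count of 394 is an assumption derived here as the training group size (520) minus the approximately 126 negative events reported in the text.

```python
def min_events_required(n_variables: int, epv: int = 10) -> int:
    """Smallest number of events needed for a model with n_variables predictors."""
    return n_variables * epv


def epv_satisfied(n_events: int, n_non_events: int, n_variables: int, epv: int = 10) -> bool:
    """Check the EPV rule against the rarer of the two outcome classes."""
    limiting_count = min(n_events, n_non_events)
    return limiting_count >= min_events_required(n_variables, epv)


# Training group figures: 520 patients, about 126 negative events (from the text).
train_size = 520
train_non_events = 126
train_events = train_size - train_non_events  # assumption: everyone else had the event

for n_variables in (9, 11):
    needed = min_events_required(n_variables)
    ok = epv_satisfied(train_events, train_non_events, n_variables)
    print(f"{n_variables} variables: need >= {needed} in the rarer class, satisfied = {ok}")
```

Checking the rarer outcome class rather than the positive events alone mirrors the text's remark that the negative events must also meet 10 EPV when positive events are more numerous.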

| Statistical analysis
In this study, all statistical analyses were performed with R software (version 4.2.1), and a two-sided P value < 0.05 was considered statistically significant. All patients were randomly divided into training and validation groups, and the distribution of variables between the two groups was compared using the chi-square test.
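As a rough illustration of the split and the balance check described above, the following Python sketch uses only the standard library. It is not the R code used in the study. The seed, the function names, and the sex-by-group counts are hypothetical; the counts were chosen only so that the margins match the reported totals (520 and 223 patients, 404 men overall).

```python
import random


def split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle indices 0..n-1 and cut them into training and validation lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * train_fraction)
    return indices[:n_train], indices[n_train:]


def chi_square_statistic(table):
    """Pearson chi-square statistic (no continuity correction) and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


train_idx, valid_idx = split_indices(743)
print(len(train_idx), len(valid_idx))

# Hypothetical counts: rows are training/validation, columns are male/female.
sex_by_group = [[283, 237], [121, 102]]
stat, df = chi_square_statistic(sex_by_group)
# For 1 degree of freedom, the 0.05 critical value is 3.841.
print(f"chi-square = {stat:.4f}, df = {df}, balanced at 0.05: {stat < 3.841}")
```

In practice a library routine such as `scipy.stats.chi2_contingency` would also return the P value; note that it applies a continuity correction to 2 x 2 tables by default, which this sketch does not.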
For prognostic factor analysis, we used univariate Cox regression analysis to identify factors associated with OS in patients with LCNEC. Significant variables (P < 0.05) were then entered into a "forward LR" multivariate Cox regression analysis to further identify independent prognostic factors. Based on the independent factors, we created a prognostic nomogram to predict 12-, 24-, and 36-month OS and calculated individual risk scores. We evaluated the performance of the model in three respects. First, we plotted time-dependent receiver operating characteristic (ROC) curves and calculated the corresponding area under the curve (AUC) to assess discrimination. Second, calibration curves were plotted to assess the predictive accuracy of the nomogram. Finally, decision curve analysis (DCA) was performed to assess the model's usefulness in clinical practice.
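The quantity plotted in a decision curve is the net benefit at each threshold probability. The Python sketch below shows that calculation in its simplest form, assuming a binary outcome at a fixed time horizon and ignoring censoring, which a survival-based DCA would need to handle. The function names and the toy data are hypothetical and are not taken from the study.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of intervening on patients whose predicted risk is >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcome)
    true_pos = sum(1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcome, threshold):
    """Reference strategy: intervene on every patient regardless of predicted risk."""
    prevalence = sum(outcome) / len(outcome)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data (hypothetical): predicted risk of death by the horizon and observed status.
risks = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1]
died = [1, 1, 1, 0, 0, 0, 0, 1]

for threshold in (0.25, 0.5, 0.75):
    model_nb = net_benefit(risks, died, threshold)
    all_nb = net_benefit_treat_all(died, threshold)
    print(f"threshold {threshold:.2f}: model {model_nb:.3f}, treat-all {all_nb:.3f}")
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.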

| Characteristics of patients
A total of 743 patients were included. Most patients were aged 50-80 years, and the majority were male (404, 54.4%), white (624, 84.0%), and had grade III-IV tumors (722, 97.2%). All included patients were randomized to the training group (n = 520) and the validation group (n = 223) at a ratio of 7:3, and the chi-square test showed no significant difference in any factor between the two groups (Table 1).

| Prognostic factors for LCNEC patients
Univariate and multivariate Cox regression analyses were used to screen for robust prognostic factors in the training group. The results showed that N stage, intrapulmonary metastasis (yes or no), brain metastasis (yes or no), bone metastasis (yes or no), and surgery at the primary site (yes or no) were independent prognostic factors (Table 2).

| Generation and validation of the prognostic nomograms
Based on the above five prognostic factors, a novel nomogram predicting OS at 12, 24, and 36 months in LCNEC patients was developed (Figure 2). The ROC curves showed that the AUCs of the nomogram for 12- and 24-month OS were 0.833 and 0.815, respectively, in the training group (Figure 3A) and 0.834 and 0.826, respectively, in the validation group (Figure 3B), indicating good discriminatory ability; the AUCs in the validation group were slightly higher than those in the training group. Calibration curves were plotted to assess the 12- and 24-month OS predictions, and the results showed strong agreement between the nomogram-predicted OS and the actual outcomes in the training group (Figure 4A, C) and the validation group (Figure 4B, D). In addition, the DCA indicated that the nomogram performed well in clinical practice (Figure 5A-D).
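To make the meaning of an AUC such as 0.833 concrete, the Python sketch below computes the AUC as the proportion of patient pairs that the risk score ranks correctly. It is a simplification under stated assumptions: the outcome is treated as a binary status at a fixed horizon and censoring is ignored, whereas the study used time-dependent ROC curves. The function name and the toy data are hypothetical.

```python
def auc_at_horizon(risk_scores, died_by_horizon):
    """AUC as the fraction of (died, survived) pairs in which the patient who died
    has the higher risk score; tied scores count as half a correct pair."""
    died = [s for s, y in zip(risk_scores, died_by_horizon) if y == 1]
    survived = [s for s, y in zip(risk_scores, died_by_horizon) if y == 0]
    if not died or not survived:
        raise ValueError("both outcome classes are required")
    correct = 0.0
    for d in died:
        for s in survived:
            if d > s:
                correct += 1.0
            elif d == s:
                correct += 0.5
    return correct / (len(died) * len(survived))


# Toy data (hypothetical): four patients, two of whom died by the horizon.
scores = [0.9, 0.2, 0.8, 0.1]
status = [1, 1, 0, 0]
print(auc_at_horizon(scores, status))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so values above 0.8 indicate that the score usually orders patients correctly.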
| DISCUSSION
We took advantage of the large, reliable sample available in the SEER database to complete this study. Most previous studies have discussed the prognostic factors associated with LCNEC, 5,9-11 but quantitative assessment tools are lacking. In this study, we constructed what is, to our knowledge, the first quantitative prognostic tool for LCNEC and verified that it has good discriminatory ability, predictive accuracy, and clinical usefulness.
Currently, the World Health Organization (WHO) classification defines LCNEC as NSCLC with histopathological features and immunohistochemical expression of neuroendocrine carcinoma. Some studies have found that it is composed of different subgroups with genomic features of small cell lung cancer (SCLC), NSCLC (mainly adenocarcinoma), and rare highly proliferative carcinoid tumors. 7 It has a strong capacity for distant metastasis, resulting in high mortality. Therefore, it is necessary to clarify its prognostic factors, which would help in making accurate clinical decisions. In this study, we generated a novel nomogram to personalize the prediction of OS in LCNEC patients based on data obtained from the SEER database. We found that N stage, bone metastasis (yes or no), brain metastasis (yes or no), intrapulmonary metastasis (yes or no), and surgery at the primary site (yes or no) were independent prognostic factors. Among them, surgery at the primary site was associated with a better prognosis and longer OS. Based on these five key variables, the nomogram can calculate a score that quantitatively assesses the prognosis.
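How a nomogram turns regression coefficients into a score can be sketched as follows. This Python example rescales Cox log-hazard coefficients so that the variable with the widest effect spans 0 to 100 points, which is a common convention and is assumed here rather than confirmed as the authors' exact procedure. The coefficient values are hypothetical placeholders and are not the estimates in Table 2; the function and variable names are likewise illustrative.

```python
def nomogram_points(coefficients):
    """Map log-hazard coefficients to points, with the widest-ranging variable on 0-100.

    coefficients: {variable: {level: beta}}, where the reference level has beta 0.0.
    """
    ranges = {
        var: max(levels.values()) - min(levels.values())
        for var, levels in coefficients.items()
    }
    widest = max(ranges.values())
    if widest == 0:
        raise ValueError("at least one variable must have a non-zero effect")
    points = {}
    for var, levels in coefficients.items():
        lowest = min(levels.values())
        points[var] = {
            level: 100 * (beta - lowest) / widest for level, beta in levels.items()
        }
    return points


def total_points(points, patient):
    """Sum the points for one patient's levels of each variable."""
    return sum(points[var][level] for var, level in patient.items())


# Hypothetical coefficients for three of the five factors (NOT the Table 2 values).
example_betas = {
    "brain_metastasis": {"no": 0.0, "yes": 0.8},
    "surgery": {"yes": 0.0, "no": 0.6},
    "bone_metastasis": {"no": 0.0, "yes": 0.4},
}
points = nomogram_points(example_betas)
patient = {"brain_metastasis": "yes", "surgery": "no", "bone_metastasis": "no"}
print(points)
print(total_points(points, patient))
```

The total is then read against a survival-probability axis in the published nomogram; that final mapping depends on the baseline survival function and is not reproduced here.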
In addition, we plotted the time-dependent ROC curves, and the corresponding time-dependent AUC values show that the model has good discriminatory ability. We also plotted calibration curves and performed DCA, which confirmed that the nomogram had good consistency and clinical practicability. Based on these results, our model can provide good guidance for further clinical evaluation and intervention.
Nowadays, for pulmonary neuroendocrine tumors, the International Association for the Study of Lung Cancer recommends the application of TNM staging to predict their prognosis. 94][15][16][17] Meanwhile, the median overall survival (mOS) also varies, with stages I-IV ranging from 44-105, 19-28, 14-23, and 6-10.2 months, respectively. 5,18,19 However, the effects of the T, N, and M stages also differ between subtypes of lung cancer. T and N staging are independent factors of OS in SCLC patients. The gap in OS between T2 and T3-4 staging is greater than that between T1 and T2 staging, and the difference in OS between N0 and N1 staging is larger, whereas the difference between N1 and N2 staging is smaller. 20 In squamous carcinoma, a common NSCLC, T staging and N staging are also independent influences. There is a large difference in OS between the N0 and N1 stages; among the T1-T4 stages, the T3 and T4 stages differ most in OS, while T2 and T3 differ least. 21
In comparison with other types of lung cancer, T, N, and M staging each had a different impact on OS in patients with LCNEC. In this study, we found that, within TNM staging, the T stage had no significant influence on OS in LCNEC patients, while the N and M stages (bone, lung, and brain metastases) were independent influences on OS. Among the N0-N3 stages, the gap in OS was largest between N0 and N1 and smallest between N1 and N2. Of all the significant factors, M staging had the greatest impact, and within it brain metastasis (yes or no) had the largest effect, suggesting that patients with brain metastases have the worst prognosis, which may be related to the higher rate of brain metastasis in patients with stage IV LCNEC. 19 Moreover, univariate analysis showed that liver metastases were unfavorable for prognosis, but multivariate analysis showed no clear effect on the prognosis. This differs from the results of another study, so further investigation is needed. 22 Similar to the results of another study, 23 this study also showed no clear correlation between tumor grade and patient OS.
For the treatment of patients with LCNEC, the National Comprehensive Cancer Network (NCCN) recommends treatment according to NSCLC guidelines.5][26] One study showed that chemotherapy was beneficial and radiotherapy had no significant benefit for the prognosis, 10 while another study showed that radiotherapy and chemotherapy were beneficial for OS in LCNEC patients. 11 A retrospective study showed that surgery combined with chemotherapy was the best treatment for patients with stage I, II, and III LCNEC, and that for stage IV patients, chemotherapy alone was more effective than other treatments. 30 Lowczak et al demonstrated that patients with radical, negative surgical margins, lower CS, negative lymph nodes, and tumor size ≤4 cm have a significantly better survival prognosis. 18 These results suggest that surgical interventions are positively associated with patient prognosis. However, the benefits of chemotherapy and radiotherapy for LCNEC remain controversial, and the sample sizes in these studies were small. In our large-sample study, there was no clear benefit of radiotherapy or chemotherapy on the prognosis. However, surgery showed a significant benefit on OS, which is similar to the previous results. 18 In addition to radiotherapy and chemotherapy, emerging treatments for LCNEC include immunotherapy and targeted therapy. 31,32 Moreover, circular RNA is an emerging biomarker that regulates cancer proliferation, apoptosis, migration, and invasion through multiple mechanisms. Its aberrant expression is present in almost all types of cancers. 33 It has guiding significance for cancer diagnosis and prognosis and is expected to serve as a target for cancer treatment. 34 Unfortunately, the SEER database does not contain the relevant data at this time, pending future updates.
This study still has some shortcomings. First, some factors that influence prognosis, such as underlying disease, Ki67, PD-L1 expression, and details of chemotherapy and radiotherapy, are either not recorded in the SEER database or are missing for a large share of patients. Second, because the analysis is retrospective and non-randomized, internal bias and limited significance are inevitable. Therefore, in the future, we will try to include more covariates and extend the clinical follow-up time to further improve the predictive value of the model.

| CONCLUSION
In this study, we included basic information and treatment regimens in addition to TNM staging to construct a predictive model that more accurately estimates OS in patients with LCNEC. In addition, we verified that it has good discriminatory ability, predictive accuracy, and clinical practicability.

FIGURE 3 The receiver operating characteristic curves of the nomogram for 12 and 24 months in the training set (A) and validation set (B).

FIGURE 4 The calibration curves of the nomogram for 12 and 24 months in the training set (A, C) and validation set (B, D).

FIGURE 5 The decision curve analysis of the nomogram for 12 and 24 months in the training set (A, C) and validation set (B, D).

TABLE 1 Baseline clinical characteristics of LCNEC patients. Grade: I, well differentiated; II, moderately differentiated; III, poorly differentiated; IV, undifferentiated. P: values calculated by chi-square test.

TABLE 2 Univariate and multivariate Cox analyses in LCNEC patients.